AITopics | main effect


Generalized Functional ANOVA in Closed-Form: A Unified View of Additive Explanations

Ferrere, Baptiste, Bousquet, Nicolas, Gamboa, Fabrice, Loubes, Jean-Michel

arXiv.org Machine Learning, May-19-2026

The functional ANOVA, or Hoeffding decomposition, provides a principled framework for interpretability by decomposing a model prediction into main effects and higher-order interactions. For independent inputs, this classical decomposition is explicit. It is closely connected to SHAP values, generalized additive models, and orthogonal polynomial expansions, and therefore constitutes a fundamental tool for additive explainability. In the more general and realistic dependent setting, however, obtaining a tractable representation and estimating the decomposition from data remain challenging. In this work, we address this problem for continuous inputs. By combining Hilbert space methods with the generalized functional ANOVA, we construct an explicit Riesz basis in which the decomposition can be computed easily. Our formulation recovers the classical independent case and its associated orthogonal decomposition. Building on this representation, we propose a simple yet effective algorithm to estimate the decomposition from a data sample in a model-agnostic setting, and we compare it empirically with several state-of-the-art explanation methods, demonstrating the power of the approach.
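The classical independent case mentioned in the abstract can be illustrated on a toy function. The following is a minimal sketch, not the paper's algorithm: it assumes two independent inputs that are uniform on [-1, 1] (approximated by a midpoint grid), and the helper name `anova_2d` and the test function are hypothetical choices made for illustration.

```python
import numpy as np

def anova_2d(f, grid1, grid2):
    """Classical functional ANOVA of f(x1, x2) on a product grid.

    Assumes independent inputs with uniform weights on the grid points.
    """
    X1, X2 = np.meshgrid(grid1, grid2, indexing="ij")
    F = f(X1, X2)
    f0 = F.mean()                              # constant term
    f1 = F.mean(axis=1) - f0                   # main effect of x1
    f2 = F.mean(axis=0) - f0                   # main effect of x2
    f12 = F - f0 - f1[:, None] - f2[None, :]   # pairwise interaction
    return f0, f1, f2, f12

n = 200
grid = (np.arange(n) + 0.5) / n * 2 - 1        # midpoints of [-1, 1]
f0, f1, f2, f12 = anova_2d(lambda a, b: a + 2 * b + a * b, grid, grid)
```

For this toy function the recovered terms are `f1 = x1`, `f2 = 2 * x2`, and `f12 = x1 * x2`, and each term averages to zero over the grid, which is the orthogonality property of the independent case. With dependent inputs this simple averaging no longer yields such a decomposition, which is the setting the paper addresses.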

artificial intelligence, data mining, machine learning

2605.18422

Genre: Research Report > New Finding (0.46)

Industry: Energy > Power Industry (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

d60e14c19cd6e0fc38556ad29ac8fbc9-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-29-2026, 22:17:37 GMT

artificial intelligence, computation time, machine learning

Genre: Research Report (0.32)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)

473803f0f2ebd77d83ee60daaa61f381-Supplemental.pdf

Neural Information Processing Systems, Feb-8-2026, 07:05:21 GMT

interaction, interaction candidate, strength

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

A Compositional Kernel Model for Feature Learning

Ruan, Feng, Liu, Keli, Jordan, Michael

arXiv.org Artificial Intelligence, Nov-5-2025

Deep learning has achieved remarkable success across domains such as vision, language, and science. A widely believed explanation for this success is representation learning -- also called feature learning -- the empirically observed ability of deep models to automatically extract task-relevant features from raw data, without manual engineering, to support downstream prediction [1]. This ability is generally attributed to two fundamental ingredients of deep models: (i) their compositional architecture and (ii) the use of optimization. The compositionality of the architecture endows the model with the ability to form intermediate representations of the data via composition of simple transformations. These representations are not manually defined but are learned from data by optimizing a loss function designed to minimize prediction error. However, despite the empirical success of this paradigm, our theoretical understanding of how and why such representations emerge remains fundamentally limited. In particular, it remains unclear how the interplay between compositional structure and optimization gives rise to task-aligned features -- and under what conditions this mechanism succeeds or fails. To address this gap, we study a stylized compositional model that preserves these two core ingredients of feature learning -- while remaining simple enough to enable analysis of how features are learnt during training.

artificial intelligence, machine learning, stationary point

2509.14158

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Tree Ensemble Explainability through the Hoeffding Functional Decomposition and TreeHFD Algorithm

Bénard, Clément

arXiv.org Machine Learning, Oct-30-2025

Tree ensembles have demonstrated state-of-the-art predictive performance across a wide range of problems involving tabular data. Nevertheless, the black-box nature of tree ensembles is a strong limitation, especially for applications with critical decisions at stake. The Hoeffding or ANOVA functional decomposition is a powerful explainability method, as it breaks down black-box models into a unique sum of lower-dimensional functions, provided that input variables are independent. In standard learning settings, input variables are often dependent, and the Hoeffding decomposition is generalized through hierarchical orthogonality constraints. Such generalization leads to unique and sparse decompositions with well-defined main effects and interactions. However, the practical estimation of this decomposition from a data sample is still an open problem. Therefore, we introduce the TreeHFD algorithm to estimate the Hoeffding decomposition of a tree ensemble from a data sample. We show the convergence of TreeHFD, along with the main properties of orthogonality, sparsity, and causal variable selection. The high performance of TreeHFD is demonstrated through experiments on both simulated and real data, using our treehfd Python package (https://github.com/ThalesGroup/treehfd). Besides, we empirically show that the widely used TreeSHAP method, based on Shapley values, is strongly connected to the Hoeffding decomposition.

artificial intelligence, machine learning, natural language

2510.24815

Country:

North America > United States (0.28)
Europe (0.27)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Information Technology (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Toward Understanding the Transferability of Adversarial Suffixes in Large Language Models

Ball, Sarah, Hasrati, Niki, Robey, Alexander, Schwarzschild, Avi, Kreuter, Frauke, Kolter, Zico, Risteski, Andrej

arXiv.org Artificial Intelligence, Oct-28-2025

Discrete optimization-based jailbreaking attacks on large language models aim to generate short, nonsensical suffixes that, when appended onto input prompts, elicit disallowed content. Notably, these suffixes are often transferable -- succeeding on prompts and models for which they were never optimized. And yet, despite the fact that transferability is surprising and empirically well-established, the field lacks a rigorous analysis of when and why transfer occurs. To fill this gap, we identify three statistical properties that strongly correlate with transfer success across numerous experimental settings: (1) how much a prompt without a suffix activates a model's internal refusal direction, (2) how strongly a suffix induces a push away from this direction, and (3) how large these shifts are in directions orthogonal to refusal. On the other hand, we find that prompt semantic similarity only weakly correlates with transfer success. These findings lead to a more fine-grained understanding of transferability, which we use in interventional experiments to showcase how our statistical analysis can translate into practical improvements in attack success.
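The three quantities listed in the abstract can be pictured as plain vector projections. The sketch below rests on assumptions: the refusal direction and the activations are taken as given NumPy vectors (how they are extracted from a model is not described here), and the function name `transfer_statistics`, the dictionary keys, and the toy numbers are hypothetical.

```python
import numpy as np

def transfer_statistics(refusal_dir, act_prompt, act_with_suffix):
    """Project activations onto a (hypothetical) refusal direction."""
    r = np.asarray(refusal_dir, dtype=float)
    r = r / np.linalg.norm(r)                  # unit refusal direction
    h = np.asarray(act_prompt, dtype=float)
    shift = np.asarray(act_with_suffix, dtype=float) - h
    along = float(shift @ r)                   # signed shift along refusal
    return {
        # (1) how much the bare prompt activates the refusal direction
        "prompt_refusal_activation": float(h @ r),
        # (2) how strongly the suffix pushes away from that direction
        "push_away_from_refusal": -along,
        # (3) size of the shift in directions orthogonal to refusal
        "orthogonal_shift_norm": float(np.linalg.norm(shift - along * r)),
    }

stats = transfer_statistics([2.0, 0.0, 0.0], [2.0, 1.0, 0.0], [0.5, 1.0, 2.0])
```

In this toy case the prompt activation along refusal is 2.0, the suffix pushes 1.5 units away from it, and the orthogonal shift has norm 2.0. The paper's actual measurement protocol may differ from this simplification.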

large language model, machine learning, natural language

2510.22014

Country: Europe > Germany (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Interaction Concordance Index: Performance Evaluation for Interaction Prediction Methods

Pahikkala, Tapio, Numminen, Riikka, Movahedi, Parisa, Karmitsa, Napsu, Airola, Antti

arXiv.org Machine Learning, Oct-17-2025

Consider two sets of entities and their members' mutual affinity values, say drug-target affinities (DTA). Drugs and targets are said to interact in their effects on DTAs if a drug's effect on the affinity depends on the target. The presence of interaction implies that assigning one drug to one target and another drug to another target does not yield the same aggregate DTA as the reversed assignment. Accordingly, correctly capturing interactions enables better decision-making, for example, in allocating a limited number of drug doses to their best-matching targets. DTA prediction is commonly learned either solely from known DTAs or together with side information on the entities, such as the chemical structures of drugs and targets. In this paper, we introduce an estimator of prediction performance for interaction directions, which we call the interaction concordance index (IC-index), for both fixed predictors and machine learning algorithms aimed at inferring them. The IC-index complements the commonly used DTA prediction performance estimators by evaluating the ratio of correctly predicted directions of interaction effects in data. First, we show the invariance of the IC-index on predictors unable to capture interactions. Second, we show that a learning algorithm's permutation equivariance with respect to drug and target identities implies its inability to capture interactions when the drug, the target, or both are unseen during training. In practical applications, this equivariance is remedied by incorporating appropriate side information on drugs and targets. We conduct a comprehensive empirical evaluation over several biomedical interaction data sets with various state-of-the-art machine learning algorithms. The experiments demonstrate how different types of affinity strength prediction methods perform in terms of the IC-index, complementing existing prediction performance estimators.
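The idea of scoring the direction of interaction effects can be sketched as follows. This is a simplified reading of the abstract and not the paper's exact definition: it assumes a fully observed drug-by-target affinity matrix, takes the interaction effect of a drug pair and target pair to be the difference between the two possible assignments, skips quadruples with zero true interaction, and scores a predicted interaction of zero as one half, by analogy with the usual concordance index. The function name `interaction_concordance` and the `tol` parameter are hypothetical.

```python
from itertools import combinations
import numpy as np

def interaction_concordance(Y_true, Y_pred, tol=1e-12):
    """Fraction of (drug pair, target pair) quadruples whose interaction
    direction is predicted correctly; predicted ties count as 0.5."""
    Y_true = np.asarray(Y_true, dtype=float)
    Y_pred = np.asarray(Y_pred, dtype=float)
    n_drugs, n_targets = Y_true.shape
    score, count = 0.0, 0
    for d1, d2 in combinations(range(n_drugs), 2):
        for t1, t2 in combinations(range(n_targets), 2):
            true_eff = (Y_true[d1, t1] + Y_true[d2, t2]
                        - Y_true[d1, t2] - Y_true[d2, t1])
            if abs(true_eff) <= tol:
                continue                       # no interaction to rank
            pred_eff = (Y_pred[d1, t1] + Y_pred[d2, t2]
                        - Y_pred[d1, t2] - Y_pred[d2, t1])
            count += 1
            if abs(pred_eff) <= tol:
                score += 0.5                   # tie: no predicted direction
            elif np.sign(pred_eff) == np.sign(true_eff):
                score += 1.0
    return score / count if count else float("nan")

Y = np.array([[1.0, 0.0], [0.0, 1.0]])
additive = np.add.outer([1.0, 2.0], [10.0, 20.0])  # row effect + column effect
ic_perfect = interaction_concordance(Y, Y)
ic_additive = interaction_concordance(Y, additive)
ic_reversed = interaction_concordance(Y, 1.0 - Y)
```

Under these assumptions a purely additive predictor, which cannot express interactions, scores 0.5 regardless of its row and column effects, whereas a perfect predictor scores 1.0 and a reversed one 0.0. That behaviour mirrors the invariance property stated in the abstract, though the paper's tie handling may differ.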

algorithm, artificial intelligence, machine learning

2510.14419

Country:

Europe > Finland > Southwest Finland > Turku (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.92)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.67)

Appendices A The Persistence Interaction Detection Algorithm

Neural Information Processing Systems, Oct-9-2025, 14:16:51 GMT

Algorithm 1: The proposed Persistence Interaction Detection (PID) algorithm. Input: a trained feed-forward neural network, target layer l, norm p. Output: ranked list of interaction candidates {I [...] Our PID framework is presented in Algorithm 1. [...] PID in all experiments of this paper (i.e., set η as 0). In this subsection, we will prove Theorem 1 and evaluate it empirically. We have the following corollary: Corollary 1. |b [...] Combining them together finishes the proof. It is trivial to show that Corollary 1 can be extended to the death time, i.e., we also have [...] After proving Corollary 1, we return to prove the theorem. In this section, first, we show how to extend PID to CNNs.

artificial intelligence, interaction, machine learning

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Sparse Deep Additive Model with Interactions: Enhancing Interpretability and Predictability

Hung, Yi-Ting, Lin, Li-Hsiang, Calhoun, Vince D.

arXiv.org Machine Learning, Sep-30-2025

Recent advances in deep learning highlight the need for personalized models that can learn from small or moderate samples, handle high-dimensional features, and remain interpretable. To address this challenge, we propose the Sparse Deep Additive Model with Interactions (SDAMI), a framework that combines sparsity-driven feature selection with deep subnetworks for flexible function approximation. Unlike conventional deep learning models, which often function as black boxes, SDAMI explicitly disentangles main effects and interaction effects to enhance interpretability. At the same time, its deep additive structure achieves higher predictive accuracy than classical additive models. Central to SDAMI is the concept of an Effect Footprint, which assumes that higher-order interactions project marginally onto main effects. Guided by this principle, SDAMI adopts a two-stage strategy: first, identify strong main effects that implicitly carry information about important interactions; second, exploit this information through structured regularization such as group lasso to distinguish genuine main effects from interaction effects. For each selected main effect, SDAMI constructs a dedicated subnetwork, enabling nonlinear function approximation while preserving interpretability and providing a structured foundation for modeling interactions. Extensive simulations with comparisons confirm SDAMI's ability to recover effect structures across diverse scenarios, while applications in reliability analysis, neuroscience, and medical diagnostics further demonstrate its versatility in addressing real-world high-dimensional modeling challenges.
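The abstract names group lasso as one example of structured regularization. As a rough illustration of how such a penalty zeroes out whole groups of coefficients, the sketch below implements block soft-thresholding, the proximal operator of the group-lasso penalty. It is not SDAMI's training procedure; the function name `group_soft_threshold`, the grouping, and the numbers are hypothetical.

```python
import numpy as np

def group_soft_threshold(weights, groups, lam):
    """Proximal step for the penalty lam * sum_g ||w_g||_2.

    Each group is shrunk toward zero as a block; groups whose
    Euclidean norm is at most lam are set to zero entirely.
    """
    w = np.asarray(weights, dtype=float).copy()
    for idx in groups:
        norm = np.linalg.norm(w[idx])
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        w[idx] = scale * w[idx]
    return w

# Two groups: a strong effect (norm 5) and a weak one (norm 0.5).
shrunk = group_soft_threshold([3.0, 4.0, 0.3, 0.4],
                              groups=[[0, 1], [2, 3]], lam=1.0)
# shrunk is [2.4, 3.2, 0.0, 0.0]
```

The strong group is scaled by 0.8 and survives, while the weak group is removed as a unit. This all-or-nothing behaviour at the group level is what makes the penalty useful for selecting entire effects rather than individual coefficients.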

interaction, main effect, sdami

2509.23068

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

The Sensitivity of Variational Bayesian Neural Network Performance to Hyperparameters

Koermer, Scott, Klein, Natalie

arXiv.org Machine Learning, Sep-26-2025

In scientific applications, predictive modeling is often of limited use without accurate uncertainty quantification (UQ) to indicate when a model may be extrapolating or when more data needs to be collected. Bayesian Neural Networks (BNNs) produce predictive uncertainty by propagating uncertainty in neural network (NN) weights and offer the promise of obtaining not only an accurate predictive model but also accurate UQ. However, in practice, obtaining accurate UQ with BNNs is difficult due in part to the approximations used for practical model training and in part to the need to choose a suitable set of hyperparameters; these hyperparameters outnumber those needed for traditional NNs and often have opaque effects on the results. We aim to shed light on the effects of hyperparameter choices for BNNs by performing a global sensitivity analysis of BNN performance under varying hyperparameter settings. Our results indicate that many of the hyperparameters interact with each other to affect both predictive accuracy and UQ. For improved usage of BNNs in real-world applications, we suggest that global sensitivity analysis, or related methods such as Bayesian optimization, should be used to aid in dimensionality reduction and selection of hyperparameters to ensure accurate UQ in BNNs.
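One common form of global sensitivity analysis is the estimation of first-order Sobol indices, which measure the share of output variance explained by each input alone. The sketch below uses a pick-freeze (Saltelli-style) Monte Carlo estimator on a toy function that stands in for "model performance as a function of two hyperparameters". The abstract does not state which estimator the authors use, so this is an assumption, and the names `first_order_sobol` and `toy_score` are hypothetical.

```python
import numpy as np

def first_order_sobol(f, dim, n, rng):
    """Pick-freeze estimate of first-order Sobol indices.

    Assumes independent inputs, each uniform on [-1, 1].
    """
    A = rng.uniform(-1.0, 1.0, size=(n, dim))
    B = rng.uniform(-1.0, 1.0, size=(n, dim))
    fA, fB = f(A), f(B)
    total_var = np.var(np.concatenate([fA, fB]))
    indices = np.empty(dim)
    for i in range(dim):
        AB = A.copy()
        AB[:, i] = B[:, i]          # share only input i with sample B
        indices[i] = np.mean(fB * (f(AB) - fA)) / total_var
    return indices

def toy_score(X):
    # Input 1 matters only through its interaction with input 0.
    return X[:, 0] + X[:, 0] * X[:, 1]

S = first_order_sobol(toy_score, dim=2, n=200_000,
                      rng=np.random.default_rng(0))
```

For this toy function the exact first-order indices are 0.75 and 0, so a quarter of the variance is attributable purely to the interaction between the two inputs. A first-order sum well below one is the kind of signal that inputs interact, which is the qualitative finding the abstract reports for BNN hyperparameters.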

data generating mechanism, divergence, hyperparameter

2509.20574

Country:

North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England (0.04)

Genre: Research Report > New Finding (0.88)

Industry: Energy (0.93)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)